Transducer Composition for “on-the-fly” Lexicon and Language Model Integration

Authors

Diamantino Caseiro

Isabel Trancoso

Abstract

In this work we present the use of a specialized composition algorithm that allows the generation of a determinized search network for ASR in a single step. The algorithm is exact in the sense that the result is determinized when the lexicon and the language model are represented as determinized transducers. The composition and determinization are performed simultaneously, which is of great importance for “on-the-fly” operation. The algorithm pushes the language model weights towards the initial state of the network. Our results show that it is advantageous to use the maximum amount of information as early as possible in the decoding procedure.
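As a rough illustration of what "on-the-fly" composition means, the sketch below expands the arcs of a composed machine only when a state is first visited. It is not the specialized algorithm described in the abstract: it performs plain lazy composition of two epsilon-free weighted transducers with additive (tropical-style) costs, and it does no determinization or weight pushing. The dictionary layout, the name `lazy_compose`, and the toy arcs are all assumptions made for this example.

```python
# Minimal sketch of lazy ("on-the-fly") transducer composition.
# Assumptions (not from the paper): both machines are epsilon-free,
# weights are costs that add along a path, and a transducer is a dict
#   state -> list of (input_label, output_label, weight, next_state).

def lazy_compose(a, b, start_a, start_b):
    """Return the start state of A o B and a function that expands
    the outgoing arcs of a composed state on demand."""
    cache = {}  # composed state -> list of arcs, filled lazily

    def arcs(state):
        if state not in cache:
            qa, qb = state
            expanded = []
            for in_label, mid_a, w_a, next_a in a.get(qa, []):
                for mid_b, out_label, w_b, next_b in b.get(qb, []):
                    # Arcs pair up when A's output matches B's input.
                    if mid_a == mid_b:
                        expanded.append(
                            (in_label, out_label, w_a + w_b, (next_a, next_b))
                        )
            cache[state] = expanded
        return cache[state]

    return (start_a, start_b), arcs


# Hypothetical toy machines standing in for a lexicon and a language model.
toy_lexicon = {0: [("a", "x", 1.0, 1), ("b", "y", 2.0, 1)]}
toy_lm = {0: [("x", "X", 0.5, 1)]}

start, arcs = lazy_compose(toy_lexicon, toy_lm, 0, 0)
print(arcs(start))
```

Only states that a search actually reaches are ever expanded, which is the property that makes dynamic composition attractive when the fully composed network would be too large to build ahead of time.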

Similar resources

On integrating the lexicon with the language model

The goal of this work was to develop an algorithm for the integration of the lexicon with the language model which would be computationally efficient in terms of memory requirements, even in the case of large trigram models. Two specialized versions of the algorithm for transducer composition were implemented. The first one is basically a composition algorithm that uses the precomputed set of t...

Using Dynamic WFST Composition for R

Our first application of weighted finite state transducers to the recognition of broadcast news provided us with an interesting framework to study several problems related to the optimization of the search space. The paper starts by describing how the use of our lexicon and language model “on-the-fly” composition algorithm is crucial in extending the transducer approach to large systems. We pre...

A tail-sharing WFST composition algorithm for large vocabulary speech recognition

This paper presents an algorithm for approximating minimization in the context of the weighted finite-state transducers approach to large vocabulary speech recognition. The algorithm is designed for the integration of the lexicon with the language model and performs composition, determinization and pushing in one step. Furthermore, it uses tail-sharing in order to approximate minimization. Our ...

Spoken Language Processing Using Weighted Finite State Transducers

The main goal of this paper is to illustrate the advantages of weighted finite state transducers (WFSTs) for spoken language processing, namely in terms of their capacity to efficiently integrate different types of knowledge sources. We shall illustrate their applicability in several areas: large vocabulary continuous speech recognition, automatic alignment using pronunciation modeling rules, g...

Pre-initialized composition for large-vocabulary speech recognition

This paper describes a modified composition algorithm that is used for combining two finite-state transducers, representing the context-dependent lexicon and the language model respectively, in large vocabulary speech recognition. This algorithm is a hybrid between the static and dynamic expansion of the resultant transducer, which maps from context-dependent phones to words and is searched duri...

Publication date: 2002
